Algorithmic Aspects of Natural Language Processing
Abstract
Examples of natural languages are Chinese, English and Italian. They are called natural because they evolved in a more or less natural way, without much deliberate design. This sets them apart from formal languages, among which are programming languages, which are designed to allow easy processing by computer algorithms. Typically, programs in programming languages such as C or Java can be processed (compiled) in close to linear time in their length. One particular feature that most programming languages have in common, and that allows for their fast processing, is the absence of ambiguity. That is, only one structure, called a parse or parse tree, can be assigned to any program, and this parse can have only one meaning. Furthermore, the design of many programming languages is such that the single parse can be found deterministically, which means that every parsing step contributes a fragment of the resulting parse. As parses have a size linear in the length of the input, this explains why parsing is possible in linear time. Subsequent processing of the parse, for example in order to compile to machine code, is also commonly possible in close to linear time.

Natural languages are quite different in this respect. Like programs in a programming language, sentences in a natural language can be assigned parses, but often the sentences are ambiguous and allow more than one parse. Even for a single parse, there may be ambiguity in the meanings of words or expressions. The existence of ambiguity in natural language is witnessed by frequent misunderstandings in daily life, but it is also an essential feature of poetry and puns.
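To make the notion of "more than one parse" concrete, the following is a minimal sketch that counts the parse trees of a sentence under a toy grammar. The grammar, the lexicon, and the function name `count_parses` are illustrative assumptions and do not come from the abstract; the counting is done with a CYK-style chart, a standard technique chosen here for brevity rather than a method the abstract prescribes.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the prepositional phrase attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # the prepositional phrase attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: each word maps to the categories it may have.
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "a": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the parse trees rooted in `start` that span the whole word list."""
    n = len(words)
    # chart[(i, j, A)] = number of parse trees of category A covering words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1, category)] += 1
    # Build longer spans from shorter ones; this takes cubic time in n.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    chart[(i, j, parent)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, start)]


print(count_parses("I saw the man".split()))
print(count_parses("I saw the man with a telescope".split()))
```

Under this toy grammar the second sentence receives two parses, one in which the telescope is the instrument of seeing and one in which it belongs to the man. The chart-based count takes time cubic in the sentence length, in contrast with the linear-time deterministic parsing available for unambiguous programming languages.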
Similar resources
Composition by Conversation
Most musical programming languages are developed purely for coding virtual instruments or algorithmic compositions. Although there has been some work in the domain of musical query languages for music information retrieval, there has been little attempt to unify the principles of musical programming and query languages with cognitive and natural language processing models that would facilitate ...
Weighted Automata in Text and Speech Processing
Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned weights or costs. We briefly describe some of the main theoretical and algorithmic aspects of these machines. In particular, we describe an efficient composition alg...
Weighted Automata in Text
Processing. Mehryar Mohri, Fernando Pereira and Michael Riley, AT&T Research, 600 Mountain Avenue, Murray Hill, NJ 07974. Abstract. Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned ...
Natural Language Semantics and Computability
This paper is a reflexion on the computability of natural language semantics. It does not contain a new model or new results in the formal semantics of natural language: it is rather a computational analysis of the logical models and algorithms currently used in natural language semantics, defined as the mapping of a statement to logical formulas — formulas, because a statement can be ambiguous...
Border Crossings
It is well established by now that computer science has a number of concerns in common with natural language understanding. Common themes show up in particular with algorithmic aspects of text processing. This chapter gives an overview of border crossings from NLP to CS and back. Starting out from syntactic analysis, we trace our route via a philosophical puzzle about meaning, Hoare correctness...
Publication date: 2010